The Best 1357 Text Embedding Tools in 2025

Jina Embeddings V3
Jina Embeddings V3 is a multilingual sentence embedding model supporting over 100 languages, specializing in sentence similarity and feature extraction tasks.
Text Embedding Transformers Supports Multiple Languages
jinaai
3.7M
911
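As a rough illustration of how a dense multilingual sentence-embedding model like this is typically used, here is a minimal sketch with the sentence-transformers library; the model id jinaai/jina-embeddings-v3 and the need for trust_remote_code are assumptions based on this listing, not a verified recipe.

```python
# Minimal sketch: encode multilingual sentences and compare them by cosine similarity.
# Model id and trust_remote_code flag are assumptions based on the listing above.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("jinaai/jina-embeddings-v3", trust_remote_code=True)

sentences = ["How is the weather today?", "Wie ist das Wetter heute?"]
embeddings = model.encode(sentences)                      # one vector per sentence
print(util.cos_sim(embeddings[0], embeddings[1]))         # cross-lingual similarity score
```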
Ms Marco MiniLM L6 V2
Apache-2.0
A cross-encoder model trained on the MS Marco passage ranking task for query-passage relevance scoring in information retrieval
Text Embedding English
cross-encoder
2.5M
86
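Unlike a bi-encoder, a cross-encoder scores each query-passage pair jointly instead of embedding them separately. A minimal sketch with sentence-transformers follows; the model id cross-encoder/ms-marco-MiniLM-L-6-v2 is assumed from this listing.

```python
# Minimal sketch: rerank candidate passages for a query with a cross-encoder.
# The model id is an assumption based on the listing above.
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "How many people live in Berlin?"
passages = [
    "Berlin has around 3.7 million registered inhabitants.",
    "Berlin is well known for its museums.",
]
scores = reranker.predict([(query, p) for p in passages])  # higher score = more relevant
print(scores)
```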
Opensearch Neural Sparse Encoding Doc V2 Distill
Apache-2.0
A sparse retrieval model trained with knowledge distillation and optimized for OpenSearch, supporting inference-free document encoding with improved search relevance and efficiency over V1.
Text Embedding Transformers English
opensearch-project
1.8M
7
Sapbert From PubMedBERT Fulltext
Apache-2.0
A biomedical entity representation model based on PubMedBERT, optimized to capture semantic relations through self-alignment pretraining.
Text Embedding English
cambridgeltl
1.7M
49
Gte Large
MIT
GTE-Large is a powerful sentence transformer model focused on sentence similarity and text embedding tasks, excelling in multiple benchmark tests.
Text Embedding English
thenlper
1.5M
278
Gte Base En V1.5
Apache-2.0
GTE-base-en-v1.5 is an English sentence transformer model focused on sentence similarity tasks, excelling in multiple text embedding benchmarks.
Text Embedding Transformers English
Alibaba-NLP
1.5M
63
Gte Multilingual Base
Apache-2.0
GTE Multilingual Base is a multilingual sentence embedding model supporting over 50 languages, suitable for tasks like sentence similarity calculation.
Text Embedding Transformers Supports Multiple Languages
Alibaba-NLP
1.2M
246
Polybert
polyBERT is a chemical language model designed to achieve fully machine-driven ultrafast polymer informatics. It maps PSMILES strings into 600-dimensional dense fingerprints to numerically represent polymer chemical structures.
Text Embedding Transformers
kuelumbus
1.0M
5
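Since polyBERT exposes its fingerprints through a sentence-embedding interface, a minimal sketch of mapping PSMILES strings to dense polymer fingerprints could look like the following; the model id kuelumbus/polyBERT and the example PSMILES strings are assumptions based on this listing.

```python
# Minimal sketch: map PSMILES repeat-unit strings to dense polymer fingerprints.
# Model id and example PSMILES strings are assumptions, not verified here.
from sentence_transformers import SentenceTransformer

polybert = SentenceTransformer("kuelumbus/polyBERT")

psmiles = ["[*]CC[*]", "[*]COC[*]"]      # e.g. polyethylene-like and polyoxymethylene-like units
fingerprints = polybert.encode(psmiles)
print(fingerprints.shape)                 # expected: (2, 600)
```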
Bert Base Turkish Cased Mean Nli Stsb Tr
Apache-2.0
A sentence embedding model based on Turkish BERT, optimized for semantic similarity tasks
Text Embedding Transformers Other
emrecan
1.0M
40
GIST Small Embedding V0
MIT
A text embedding model fine-tuned from BAAI/bge-small-en-v1.5 on the MEDI dataset and MTEB classification task datasets, optimized for query encoding in retrieval tasks.
Text Embedding English
avsolatorio
945.68k
29
Gte Large En V1.5
Apache-2.0
GTE-large-en-v1.5 is a high-performance English text embedding model that excels in multiple text similarity and classification tasks.
Text Embedding Transformers English
Alibaba-NLP
891.76k
213
Snowflake Arctic Embed M
Apache-2.0
Snowflake Arctic Embed M is a sentence transformer model focused on sentence similarity tasks, capable of efficiently extracting text features and calculating similarity between sentences.
Text Embedding Transformers
Snowflake
722.08k
154
Splade Cocondenser Ensembledistil
A SPLADE model for passage retrieval that improves sparse neural information retrieval through knowledge distillation.
Text Embedding Transformers English
naver
606.73k
42
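SPLADE represents text as a sparse vector over the vocabulary rather than a dense embedding; the usual formulation max-pools log(1 + ReLU(masked-LM logits)) over token positions. The following is a minimal sketch under that assumption, with the model id taken from this listing rather than verified.

```python
# Minimal sketch of SPLADE-style sparse encoding with the Hugging Face transformers API.
# Term weights: max over token positions of log(1 + ReLU(masked-LM logits)).
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "naver/splade-cocondenser-ensembledistil"   # assumed id from the listing
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

inputs = tokenizer("sparse neural information retrieval", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits                    # (1, seq_len, vocab_size)

weights = torch.log1p(torch.relu(logits))
weights = weights * inputs["attention_mask"].unsqueeze(-1)   # ignore padding positions
sparse_vec = weights.max(dim=1).values.squeeze(0)            # one weight per vocabulary term

# Inspect the highest-weighted vocabulary terms.
top = torch.topk(sparse_vec, k=10)
print([(tokenizer.decode([int(i)]), round(v.item(), 2)) for i, v in zip(top.indices, top.values)])
```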
Text2vec Base Chinese
Apache-2.0
A Chinese text embedding model based on the CoSENT (Cosine Sentence) model, which can map sentences to a 768-dimensional dense vector space and is suitable for tasks such as sentence embedding, text matching, or semantic search.
Text Embedding Chinese
shibing624
605.98k
718
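A minimal sketch of using such a model for Chinese semantic matching via sentence-transformers; the model id shibing624/text2vec-base-chinese and the example sentences are assumptions based on this listing.

```python
# Minimal sketch: embed two Chinese sentences and compare them by cosine similarity.
# The model id is an assumption based on the listing above.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("shibing624/text2vec-base-chinese")
embeddings = model.encode(["如何更换花呗绑定银行卡", "花呗更改绑定银行卡"])
print(util.cos_sim(embeddings[0], embeddings[1]))   # cosine similarity of the sentence pair
```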
Rubert Tiny2
MIT
A compact BERT-based Russian encoder capable of generating high-quality sentence embeddings
Text Embedding Transformers Other
cointegrated
585.48k
135
Ms Marco MiniLM L2 V2
Apache-2.0
A cross-encoder model trained on the MS Marco passage ranking task for query-passage relevance scoring in information retrieval.
Text Embedding English
cross-encoder
533.42k
11
Ruri Base
Apache-2.0
Ruri is a general-purpose Japanese text embedding model, focused on sentence similarity and feature extraction tasks.
Text Embedding Japanese
cl-nagoya
523.56k
9
KR SBERT V40K Kluenli Augsts
This is a Korean sentence embedding model based on sentence-transformers, capable of mapping sentences and paragraphs into a 768-dimensional dense vector space, suitable for tasks such as clustering or semantic search.
Text Embedding Transformers Korean
snunlp
500.73k
61
Gte Small
MIT
GTE-small is a general text embedding model trained by Alibaba DAMO Academy, based on the BERT framework, suitable for tasks such as information retrieval and semantic text similarity.
Text Embedding Transformers English
Supabase
481.27k
89
Ms Marco MiniLM L12 V2
Apache-2.0
A cross-encoder model trained on the MS Marco passage ranking task for relevance ranking in information retrieval.
Text Embedding English
cross-encoder
469.35k
71
All Minilm L6 V2 With Attentions
Apache-2.0
This is an ONNX port of sentence-transformers/all-MiniLM-L6-v2, adjusted to return attention weights, specifically designed for BM42 search scenarios.
Text Embedding Transformers English
Qdrant
450.93k
10
Gte Small
MIT
GTE-small is a compact general-purpose text embedding model suitable for various natural language processing tasks, including sentence similarity calculation, text classification, and retrieval.
Text Embedding English
thenlper
450.86k
158
Sbert Large Nlu Ru
MIT
This is a large Russian language model based on the BERT architecture, specifically designed for generating sentence embeddings with case-insensitive processing support.
Text Embedding Transformers Other
ai-forever
386.96k
84
Labse En Ru
A streamlined version of the LaBSE model specialized for English and Russian, significantly reducing model size while preserving original embedding quality
Text Embedding Transformers Supports Multiple Languages
cointegrated
375.34k
51
Sentence Similarity Spanish Es
Apache-2.0
This is a Spanish sentence similarity calculation model based on sentence-transformers, capable of mapping sentences and paragraphs into a 768-dimensional vector space.
Text Embedding Transformers Spanish
hiiamsid
349.51k
48
Roberta Base Bne Finetuned Msmarco Qa Es Mnrl Mn
Apache-2.0
This is a Spanish-based sentence-transformers model specifically designed for question-answering scenarios, capable of mapping sentences and paragraphs into a 768-dimensional vector space, suitable for semantic search and clustering tasks.
Text Embedding Spanish
dariolopez
347.38k
5
USER Bge M3
Apache-2.0
A universal sentence encoder for Russian, built on the sentence-transformers framework and designed to extract 1024-dimensional dense vectors from Russian text.
Text Embedding Other
deepvk
339.46k
58
Bge Small En V1.5 Onnx Q
Apache-2.0
Quantized ONNX version of the BAAI/bge-small-en-v1.5 model for text classification and similarity search.
Text Embedding Transformers
Qdrant
329.03k
1
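Quantized ONNX exports like this are aimed at lightweight CPU inference, for example through the fastembed library. A minimal sketch follows; the fastembed model name is an assumption, and fastembed resolves it to its own ONNX artifact rather than loading this repository directly.

```python
# Minimal sketch: CPU-side embedding with fastembed, which runs ONNX exports of BGE models.
# The model name is an assumption; fastembed picks the ONNX artifact it ships with.
from fastembed import TextEmbedding

model = TextEmbedding(model_name="BAAI/bge-small-en-v1.5")
docs = ["ONNX quantization shrinks the model", "embeddings for similarity search"]
embeddings = list(model.embed(docs))     # list of 384-dimensional numpy arrays
print(len(embeddings), embeddings[0].shape)
```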
Gte Base
MIT
GTE-Base is a general-purpose text embedding model focused on sentence similarity and text retrieval tasks, performing well on multiple benchmarks.
Text Embedding English
thenlper
317.05k
117
Bge M3 Onnx O4
MIT
This is the ONNX quantized version of the BAAI/bge-m3 model, supporting three functionalities: dense retrieval, multi-vector retrieval, and sparse retrieval, covering over 100 languages.
Text Embedding Transformers
hooman650
285.96k
10
Sup Simcse Roberta Large
Supervised SimCSE model based on RoBERTa-large for sentence embedding and feature extraction tasks.
Text Embedding
princeton-nlp
276.47k
25
GIST Embedding V0
MIT
GIST-Embedding-v0 is a sentence embedding model based on sentence-transformers, mainly used for sentence similarity calculation and feature extraction tasks.
Text Embedding English
avsolatorio
252.21k
26
Bge Micro V2
MIT
BGE Micro v2 is a lightweight model focused on sentence similarity calculation, suitable for various natural language processing tasks.
Text Embedding Transformers
TaylorAI
248.53k
46
Ms Marco TinyBERT L2 V2
Apache-2.0
A lightweight cross-encoder trained on the MS Marco passage ranking task for query-passage relevance scoring in information retrieval
Text Embedding English
cross-encoder
247.59k
25
Sapbert From PubMedBERT Fulltext Mean Token
A biomedical entity representation model based on PubMedBERT, optimized to capture semantic relations through self-alignment pretraining.
Text Embedding
cambridgeltl
244.39k
0
Nomic Embed Text V2 Moe
Apache-2.0
Nomic Embed v2 is a high-performance multilingual Mixture of Experts (MoE) text embedding model supporting approximately 100 languages, excelling in multilingual retrieval tasks.
Text Embedding Supports Multiple Languages
nomic-ai
242.32k
357
Gte Qwen2 1.5B Instruct
Apache-2.0
A general-purpose text embedding model based on Qwen2-1.5B, supporting multilingual and long-text processing
Text Embedding Transformers
Alibaba-NLP
242.12k
207
Gte Multilingual Reranker Base
Apache-2.0
The first multilingual reranking model in the GTE series, supporting 70+ languages with high performance and long text processing capabilities.
Text Embedding Transformers Supports Multiple Languages
Alibaba-NLP
239.91k
122
Amber Large
Apache-2.0
A Japanese-English bilingual sentence feature extraction model based on modernbert-ja-310m, supporting sentence similarity computation and text classification tasks
Text Embedding Supports Multiple Languages
retrieva-jp
239.28k
7
Mmlw Retrieval Roberta Large
Apache-2.0
MMLW (I Must Get Better Messages) is a neural text encoder for Polish, optimized for information retrieval tasks.
Text Embedding Transformers Other
sdadas
237.90k
12
Ms Marco MiniLM L4 V2
Apache-2.0
A cross-encoder model trained on the MS Marco passage ranking task for scoring query-passage relevance in information retrieval
Text Embedding English
cross-encoder
234.18k
10
Snowflake Arctic Embed L V2.0
Apache-2.0
Snowflake Arctic Embed v2.0 is a multilingual sentence embedding model that supports text feature extraction and sentence similarity calculation for over 100 languages.
Text Embedding Transformers Supports Multiple Languages
Snowflake
231.00k
156